AM-FM Based Robust Speaker Identification in Babble Noise
Authors
Abstract
Speech babble is one of the most challenging forms of noise interference for speech and speaker recognition systems because of its speaker- and speech-like characteristics. The performance of such systems degrades strongly in the presence of background noise such as babble. Existing techniques address this problem by applying additional processing to the speech signal to remove the noise. In contrast to existing work, the aim here is to improve noise robustness through the features alone. To derive robust features, an amplitude modulation-frequency modulation (AM-FM) based speaker model is proposed. The robust features are derived by fusing characteristics of the speech production and speech perception mechanisms. Performance is evaluated using clean speech from the TIMIT database combined with babble noise from the NOISEX-92 database. Experimental results show that the proposed features significantly improve performance over conventional Mel-frequency cepstral coefficient (MFCC) features under mismatched training and testing environments.
General Terms: Pattern Recognition, Algorithms, Experimentation.
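The evaluation described in the abstract combines clean speech with babble noise. A minimal sketch of one common way to build such noisy test material is to scale the noise so that the mixture reaches a chosen signal-to-noise ratio (SNR). The abstract does not state the SNR levels or the mixing procedure actually used, so the function names `mix_at_snr` and `measured_snr_db`, the 10 dB value, and the synthetic stand-in signals below are all assumptions for illustration, not the authors' setup.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper: both inputs are plain lists of float samples.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(v * v for v in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    # Choose gain g so that p_speech / (g^2 * p_noise) == 10^(snr_db / 10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * v for s, v in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """Recover the SNR of a mixture, given the clean speech it was built from."""
    residual = [m - s for s, m in zip(speech, mixture)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_residual = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(p_speech / p_residual)


# Synthetic stand-ins (assumptions): a sinusoid in place of a clean utterance,
# Gaussian noise in place of a babble recording.
rng = random.Random(0)
clean = [math.sin(0.05 * n) for n in range(1600)]
babble = [rng.gauss(0.0, 1.0) for _ in range(2000)]

noisy = mix_at_snr(clean, babble, snr_db=10.0)
achieved = measured_snr_db(clean, noisy)
```

In a mismatched-condition experiment of the kind the abstract mentions, speaker models would be trained on the clean signals and tested on mixtures like `noisy`; `achieved` can be used to confirm that the mixture has the intended SNR.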
Similar resources
Speaker Identification using FM Features
The AM-FM modulation model of speech is a nonlinear model that has been successfully used in several branches of speech-related research. However, the significance of the AM-FM features extracted from this model has not been fully explored in applications such as speaker identification systems. This paper shows that frequency modulation (FM) features can improve speaker identification accuracy....
Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification
We study robust pitch synchronous parameters that are derived from envelope and instantaneous frequencies estimated via a bank of cochlear filters. Closed-set speaker identification experiments are performed on the SPIDRE corpus with matched and mismatched handset conditions. The recognizer is based on a hybrid Linear Vector Quantization and Single Layer Perceptron (LVQSLP). Experiments are rep...
Does elderly speech recognition in noise benefit from spectral and visual cues?
Previous research with young adults has shown that temporal (amplitude modulated, AM) cues are sufficient for recognizing speech in quiet but not for speech in noise. Speech perception in noise is more robust when spectral (frequency modulated, FM) cues are provided in addition to AM ones; visual cues (AV) provide an additional benefit. The elderly typically have problems recognizing speech in ...
Assessment of single-channel speech enhancement techniques for speaker identification under mismatched conditions
It is well known that MFCC based speaker identification (SID) systems easily break down under mismatched train and test conditions. In this study, we report on evaluation of four different single-channel speech enhancement front-ends for robust SID under such conditions. Speech files from the YOHO database are corrupted with four types of noise including babble, car, factory, and white at five ...
Vibrotactile Identification of Signal-Processed Sounds from Environmental Events Presented by a Portable Vibrator: A Laboratory Study
Objectives: To evaluate different signal-processing algorithms for tactile identification of environmental sounds in a monitoring aid for the deafblind. Two men and three women, sensorineurally deaf or profoundly hearing impaired with experience of vibratory experiments, age 22-36 years. Methods: A closed set of 45 representative environmental sounds were processed using two transposing (TRH...